Livestock diseases significantly affect animal health, production, and the economy of rural agricultural communities, particularly in areas with intensive pig farming. Early and accurate diagnosis is crucial to prevent disease from spreading and to minimize financial losses. However, traditional monitoring relies on manual inspection and frequent veterinary examinations, which can delay diagnosis and intervention. To address these issues, the researchers designed a machine learning system that predicts the risk of common diseases on pig farms in Odisha, India. A synthetic dataset resembling a real farm was generated over a simulated multi-year period, covering environmental, management, and animal health parameters: temperature, humidity, rainfall, herd size, vaccination rate, biosecurity score, feed intake, water consumption, average weight, and disease history. To improve data quality and model performance, data preparation included cleaning, normalization, feature selection, and data transformation. Four classification techniques were trained and compared: Random Forest, Support Vector Machine, Decision Tree, and K-Nearest Neighbour. Performance was evaluated using accuracy, precision, recall, and F1-score. Random Forest outperformed the other classifiers, with an accuracy of 94.8%, precision of 94.1%, recall of 95.6%, and an F1-score of 94.8%. The proposed framework improves the accuracy of disease risk prediction and supports timely animal health management by enabling faster decisions.
Introduction
This study focuses on developing a Machine Learning (ML)-based disease prediction system for pig farming to improve livestock health management and support sustainable agriculture. Pig farming is an important source of food, income, and employment, particularly in developing countries. However, disease outbreaks remain a major challenge, causing economic losses and reducing productivity. Traditional disease monitoring methods, which rely on manual observation and veterinary inspections, are often slow and ineffective in detecting health problems early.
Background and Need
Animal diseases are influenced by multiple factors, including environmental conditions, farm management practices, and the health status of the animals.
Conventional monitoring methods struggle to handle these complex and interconnected factors. Therefore, the study proposes an intelligent, data-driven system that can predict disease risks early, helping farmers make informed decisions and improve animal health.
Literature Review
Previous research has shown the potential of ML in livestock management:
Predicting ammonia emissions in pig houses.
Forecasting infectious diseases in poultry farms.
Monitoring pig feeding behavior to predict tail-biting incidents.
Predicting pig growth and weight gain.
Assessing biosecurity risks and detecting anomalies in farm operations.
Predicting disease occurrence using farm data and feeding patterns.
Although these studies demonstrated the effectiveness of ML, most focused on specific diseases, behaviors, or management issues. There is still a lack of a comprehensive system that combines environmental, management, and health-related factors for general disease prediction in pig farming, especially in developing agricultural regions.
Proposed System
The study develops an integrated ML framework that:
Collects farm-related information.
Predicts disease risks in pigs.
Supports early disease detection.
Provides health status assessments and decision support for farmers.
The system architecture includes:
Data collection from farm records.
Data preprocessing (cleaning, normalization, encoding).
Feature selection to identify important variables.
Training and testing multiple ML models.
Disease prediction and visualization through a web-based interface.
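The preprocessing stage listed above (cleaning, normalization, encoding) can be illustrated with a minimal sketch. The study does not specify how each step was implemented, so the choices here are assumptions: missing values are filled with the column mean, numeric fields are min-max scaled to [0, 1], and risk labels are mapped to integer codes. The field names and sample values are hypothetical.

```python
from statistics import mean

def clean(records, numeric_fields):
    """Fill missing numeric values (None) with the column mean (assumed strategy)."""
    cleaned = [dict(r) for r in records]
    for field in numeric_fields:
        present = [r[field] for r in cleaned if r[field] is not None]
        fill = mean(present)
        for r in cleaned:
            if r[field] is None:
                r[field] = fill
    return cleaned

def min_max_normalize(records, numeric_fields):
    """Rescale each numeric field to the range [0, 1]."""
    out = [dict(r) for r in records]
    for field in numeric_fields:
        values = [r[field] for r in out]
        lo, hi = min(values), max(values)
        span = hi - lo
        for r in out:
            # A constant column carries no information; map it to 0.0.
            r[field] = (r[field] - lo) / span if span else 0.0
    return out

def encode_labels(records, field):
    """Map categorical strings to integer codes, assigned in sorted order."""
    classes = sorted({r[field] for r in records})
    mapping = {c: i for i, c in enumerate(classes)}
    out = [dict(r) for r in records]
    for r in out:
        r[field] = mapping[r[field]]
    return out, mapping

# Hypothetical farm records; None marks a missing reading.
records = [
    {"temperature": 30.0, "humidity": 80.0, "disease_risk": "high"},
    {"temperature": None, "humidity": 60.0, "disease_risk": "low"},
    {"temperature": 20.0, "humidity": 70.0, "disease_risk": "low"},
]
numeric = ["temperature", "humidity"]
normalized = min_max_normalize(clean(records, numeric), numeric)
prepared, label_map = encode_labels(normalized, "disease_risk")
```

Each function returns copies rather than mutating its input, so the raw records remain available for inspection after preprocessing.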
Dataset
A simulated dataset representing pig farms in Odisha, India, was used. It included:
Temperature
Humidity
Rainfall
Herd size
Vaccination rate
Biosecurity score
Water and feed consumption
Average animal weight
Disease history
Disease risk labels
The dataset was designed to mimic real farming conditions and support model development and testing.
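As an illustration of how such a simulated dataset might be produced, the following sketch draws each listed parameter from a uniform range and assigns a risk label with a simple rule. The value ranges, units, field names, and the labelling rule are all assumptions made for this example; the study does not describe its actual simulation procedure.

```python
import random

# Hypothetical value ranges and units; not taken from the study.
FEATURE_RANGES = {
    "temperature_c": (18.0, 40.0),
    "humidity_pct": (40.0, 95.0),
    "rainfall_mm": (0.0, 300.0),
    "herd_size": (10, 500),
    "vaccination_rate": (0.3, 1.0),
    "biosecurity_score": (1.0, 10.0),
    "feed_intake_kg": (1.0, 3.5),
    "water_consumption_l": (3.0, 12.0),
    "avg_weight_kg": (20.0, 120.0),
}

def simulate_farm_record(rng):
    """Draw one synthetic farm record from the assumed ranges."""
    record = {}
    for name, (low, high) in FEATURE_RANGES.items():
        if isinstance(low, int):
            record[name] = rng.randint(low, high)
        else:
            record[name] = round(rng.uniform(low, high), 2)
    record["disease_history"] = rng.choice([0, 1])
    # Hypothetical labelling rule: count simple risk indicators.
    risk_points = (
        (record["humidity_pct"] > 80)
        + (record["vaccination_rate"] < 0.6)
        + (record["biosecurity_score"] < 4)
        + record["disease_history"]
    )
    record["disease_risk"] = "high" if risk_points >= 2 else "low"
    return record

def simulate_dataset(n_records, seed=42):
    """Generate a reproducible list of synthetic records."""
    rng = random.Random(seed)
    return [simulate_farm_record(rng) for _ in range(n_records)]

dataset = simulate_dataset(1000)
```

Seeding a dedicated `random.Random` instance keeps the generated data reproducible without affecting the global random state.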
Methodology
The workflow consists of:
Data Collection – Gathering environmental, management, and health-related information.
Data Preprocessing – Cleaning and normalizing data to improve quality.
Feature Selection – Identifying the most influential variables.
Model Development – Training different ML classifiers.
Model Evaluation – Comparing performance using standard metrics.
Disease Prediction and Visualization – Generating disease risk assessments and presenting results in an understandable format.
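The feature selection step in this workflow can be sketched as a simple ranking. The study does not name the selection technique it used, so this example assumes one common filter approach: scoring each feature by the absolute Pearson correlation between its values and a binary risk label. The feature names and toy values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_features(rows, labels, feature_names):
    """Sort features by |correlation with the label|, strongest first."""
    scores = {}
    for name in feature_names:
        column = [row[name] for row in rows]
        scores[name] = abs(pearson(column, labels))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data: label 1 = high disease risk, 0 = low disease risk.
rows = [
    {"humidity": 85, "herd_size": 120, "vaccination_rate": 0.4},
    {"humidity": 60, "herd_size": 300, "vaccination_rate": 0.9},
    {"humidity": 90, "herd_size": 200, "vaccination_rate": 0.5},
    {"humidity": 55, "herd_size": 150, "vaccination_rate": 0.95},
]
labels = [1, 0, 1, 0]
ranking = rank_features(rows, labels, ["humidity", "herd_size", "vaccination_rate"])
top_two = [name for name, _ in ranking[:2]]
```

In this toy data, humidity and vaccination rate track the label closely while herd size does not, so herd size ranks last. A correlation filter only captures linear, single-feature effects; model-based importance scores would be an alternative.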
Machine Learning Algorithms Used
Random Forest (RF)
Ensemble-based method using multiple decision trees.
Handles complex relationships among variables.
Reduces overfitting and improves prediction accuracy.
Support Vector Machine (SVM)
Separates health conditions using optimal decision boundaries.
Effective for structured classification problems.
Decision Tree (DT)
Uses a hierarchical decision-making process.
Easy to interpret and understand.
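To make the hierarchical decision process concrete, the sketch below finds the single best threshold on one feature, which is the basic step a decision tree repeats at every node. It assumes Gini impurity as the split criterion; the study does not state which criterion was used. The biosecurity scores and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical biosecurity scores with their disease risk labels.
biosecurity = [2.0, 3.0, 3.5, 6.0, 7.5, 9.0]
risk = ["high", "high", "high", "low", "low", "low"]
threshold, impurity = best_threshold(biosecurity, risk)
```

Here the split at 4.75 separates the two classes perfectly, giving a weighted impurity of 0.0. A full tree would apply the same search recursively to each resulting subset.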
K-Nearest Neighbour (KNN)
Predicts outcomes based on similarity with nearby observations.
Flexible and simple classification approach.
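The similarity-based prediction used by KNN can be shown in a few lines. This is a minimal sketch, assuming Euclidean distance, majority voting, and k = 3; the study does not report its distance metric or value of k. The two features (normalized humidity and vaccination rate) and the sample points are hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, query, k=3):
    """Predict the majority label among the k training points nearest to query."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical points: (normalized humidity, vaccination rate).
train_X = [(0.9, 0.2), (0.8, 0.3), (0.85, 0.25), (0.3, 0.9), (0.2, 0.8), (0.25, 0.95)]
train_y = ["high", "high", "high", "low", "low", "low"]
prediction = knn_predict(train_X, train_y, (0.7, 0.4))
```

Because KNN relies on raw distances, features should be normalized to comparable scales first; otherwise a wide-ranging variable such as herd size would dominate the distance.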
Performance Evaluation
Models were evaluated using:
Accuracy
Precision
Recall
F1-Score
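For reference, the four metrics can be computed from the counts of correct and incorrect predictions. The sketch below assumes a binary setting with "high" risk as the positive class; the study does not state the number of risk classes or how metrics were averaged. The label lists are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive="high"):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical true and predicted risk labels for eight farms.
y_true = ["high", "high", "high", "low", "low", "low", "low", "high"]
y_pred = ["high", "high", "low", "low", "low", "high", "low", "high"]
metrics = classification_metrics(y_true, y_pred)
```

In disease prediction, recall is often the metric to watch, since a missed high-risk farm (a false negative) is usually more costly than a false alarm.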
Algorithm                       Accuracy (%)   Precision (%)   Recall (%)   F1-Score (%)
Random Forest (RF)              94.8           94.1            95.6         94.8
Support Vector Machine (SVM)    91.6           90.8            92.7         91.7
Decision Tree (DT)              88.9           88.1            89.5         88.8
K-Nearest Neighbour (KNN)       86.7           85.9            87.4         86.6
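As a quick consistency check on the reported results, the F1-score of each model can be recomputed from its reported precision and recall using F1 = 2PR / (P + R). This sketch assumes each F1-score was derived from the same aggregate precision and recall shown in the results; with macro averaging over several classes the relation need not hold exactly.

```python
# Reported (precision %, recall %, F1 %) per model, taken from the results above.
reported = {
    "RF": (94.1, 95.6, 94.8),
    "SVM": (90.8, 92.7, 91.7),
    "DT": (88.1, 89.5, 88.8),
    "KNN": (85.9, 87.4, 86.6),
}

def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

recomputed = {
    name: round(f1_from(precision, recall), 1)
    for name, (precision, recall, _) in reported.items()
}
consistent = all(recomputed[name] == reported[name][2] for name in reported)
```

For all four models the recomputed value matches the reported F1-score to one decimal place.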
Key Findings
Random Forest achieved the best performance, with an accuracy of 94.8% and the highest precision, recall, and F1-score.
The proposed system effectively predicts disease risks using farm-related data.
Integrating environmental, management, and animal health factors improves disease monitoring and prediction accuracy.
Conclusion
The primary aim of this system was to make animal health monitoring more convenient by establishing an ML framework for disease prediction in pig farms. The main objective was to identify disease risks earlier and to reduce reliance on traditional, observation-based monitoring. The method provides a structured approach to assessing disease risk and helps livestock managers make informed decisions based on environmental conditions, farm management indicators, livestock health information, and historical disease-related factors. Several classification models were compared to determine which predicts disease best. RF performed best overall, with an accuracy of 94.8%, precision of 94.1%, recall of 95.6%, and an F1-score of 94.8%. Feature selection improved the quality of predictions and made them easier to interpret by focusing on the factors relevant to disease occurrence. In summary, the application demonstrates the potential of ML to help assess animal health, and it provides a reliable and scalable tool for timely farm monitoring and sustainable farming practices.
Future work could make the system easier to use and add further features for livestock disease monitoring. The framework can be extended with monitoring tools that collect farm data in real time, enabling continuous health assessment. Larger, region-specific datasets may make the system easier to adapt to different farming and livestock conditions. Additional environmental, nutritional, and health factors could be incorporated to give a fuller picture of disease. The system could also be extended to other livestock species and to disease-specific forecasting. Enhanced visualization and decision-support tools would further improve usability and aid livestock management.